# Chart Parsing
Hyperclovax SEED Vision Instruct 3B
Other
HyperCLOVAX-SEED-Vision-Instruct-3B is a lightweight multimodal model developed by NAVER, featuring image-text understanding and text generation capabilities, with special optimization for Korean language processing.
Image-to-Text
Transformers

naver-hyperclovax
160.75k
170
Mlcd Vit Bigg Patch14 448
MIT
MLCD-ViT-bigG is an advanced Vision Transformer model enhanced with 2D Rotary Position Encoding (RoPE2D), excelling in document understanding and visual question answering tasks.
Text Recognition
DeepGlint-AI
1,517
3
H2ovl Mississippi 800m
Apache-2.0
An 800M-parameter vision-language model from H2O.ai that specializes in OCR and document understanding.
Image-to-Text
Transformers, English

h2oai
77.67k
33
Fuyu 8b
Fuyu-8B is a multimodal text-image transformer developed by Adept AI, designed for digital agents, supporting arbitrary image resolutions with swift responses and a streamlined architecture.
Image-to-Text
Transformers

adept
14.22k
1,006